This IPython notebook is designed to explore the SOM (self-organizing map) model. The whole notebook may be run at once:
Kernel -> Restart from the menu above.Cell -> Run All.Alternatively, you may run each cell in sequence, starting at the top of the notebook and pressing Shift + Enter.
The next three cells define the SOM model in its entirety, increasing the retinal and cortical densities to suitable values:
"""
Basic example of a fully connected SOM retinotopic map with ConnectionFields.
Contains a Retina (2D Gaussian generator) fully connected to a V1
(SOM) sheet, with no initial ordering for topography.
Constructed to match the retinotopic simulation from page 53-59 of
Miikkulainen, Bednar, Choe, and Sirosh (2005), Computational Maps in
the Visual Cortex, Springer. Known differences include:
- The cortex_density and retina_density are smaller for
speed (compared to 40 and 24 in the book).
- The original simulation used a radius_0 of 13.3/40, which does work
for some random seeds, but a much larger radius is used here so that
it converges more reliably.
"""
import topo
import imagen
from math import exp, sqrt
import param
from topo import learningfn,numbergen,transferfn,pattern,projection,responsefn,sheet
import topo.learningfn.projfn
import topo.pattern.random
import topo.responsefn.optimized
import topo.transferfn.misc
# Parameters that can be passed on the command line using -p
from topo.misc.commandline import global_params as p
p.add(
retina_density=param.Number(default=10.0,bounds=(0,None),
inclusive_bounds=(False,True),doc="""
The nominal_density to use for the retina."""),
cortex_density=param.Number(default=10.0,bounds=(0,None),
inclusive_bounds=(False,True),doc="""
The nominal_density to use for V1."""),
input_seed=param.Number(default=0,bounds=(0,None),doc="""
Seed for the pseudorandom number generator controlling the
input patterns."""),
weight_seed=param.Number(default=0,bounds=(0,None),doc="""
Seed for the pseudorandom number generator controlling the
initial weight values."""),
radius_0=param.Number(default=1.0,bounds=(0,None),doc="""
Starting radius for the neighborhood function."""),
alpha_0=param.Number(default=0.42,bounds=(0,None),doc="""
Starting value for the learning rate."""))
The cell below may be used to modify the parameters defined above, allowing you to explore the parameter space of the model. To illustrate, the default retinal and cortical densities are set to the value of 10 (their default values) but you may change the value of any parameter in this way:
p.retina_density=10
p.cortex_density=10
param.Dynamic.time_dependent=True
# input pattern
sheet.GeneratorSheet.period = 1.0
sheet.GeneratorSheet.phase = 0.05
sheet.GeneratorSheet.nominal_density = p.retina_density
input_pattern = pattern.Gaussian(scale=1.0,size=2*sqrt(2.0*0.1*24.0)/24.0,
aspect_ratio=1.0,orientation=0,
x=numbergen.UniformRandom(name='xgen', lbound=-0.5,ubound=0.5, seed=p.input_seed+12, time_dependent=True),
y=numbergen.UniformRandom(name='ygen', lbound=-0.5,ubound=0.5, seed=p.input_seed+56, time_dependent=True))
# Local variable to allow weights to be controlled easily
pattern.random.seed((p.weight_seed+67,p.weight_seed+89))
topo.sim['Retina'] = sheet.GeneratorSheet(input_generator=input_pattern)
topo.sim['V1'] = sheet.CFSheet(
nominal_density = p.cortex_density,
# Original CMVC simulation used an initial radius of 13.3/40.0 (~0.33)
output_fns=[transferfn.misc.KernelMax(density=p.cortex_density,
kernel_radius=numbergen.BoundedNumber(
bounds=(0.5/40,None),
generator=numbergen.ExponentialDecay(
starting_value=p.radius_0,
time_constant=40000/5.0)))])
topo.sim.connect('Retina','V1',name='Afferent',delay=0.05,
connection_type=projection.CFProjection,
weights_generator = pattern.random.UniformRandom(),
nominal_bounds_template=sheet.BoundingBox(radius=1.0), # fully connected network.
learning_rate=numbergen.ExponentialDecay(
starting_value = p.alpha_0,
time_constant=40000/6.0),
response_fn = responsefn.optimized.CFPRF_EuclideanDistance_opt(),
learning_fn = learningfn.projfn.CFPLF_EuclideanHebbian())
'Loaded the self-organizing map model (som_retinotopy)'
Now the model is loaded, we shall explore it, following the structure of the original tutorial.
The structure of the loaded model is shown in this screenshot taken from the Tk GUI's Model Editor:
The large circle indicates that these two Sheets are fully connected.
The plot below shows the initial set of weights from a 10x10 subset of the V1 neurons:
topo.sim.V1.Afferent.grid()
As we can see, the initial weights are uniform random. Each neuron is fully connected to the input units, and thus has a 24x24 array of weights as shown above, or a 10x10 array if using the default (reduced) density.
We will be visualizing the center-of-gravity (CoG) of the V1 Afferent weights using the following measurement command:
from topo.command.analysis import measure_cog
cog_data = measure_cog()
The center-of-gravity (a.k.a. centroid or center of mass) is computed using the weights for each neuron. The plot below shows each neuron represented by a point with a line segment is drawn from each neuron to each of its four immediate neighbors so that neighborhood relationships (if any) will be visible.
topo.sim.V1.views.maps.CoG
From this plot is is clear that all of the neurons have a CoG near the center of the retina, which is to be expected because the weights are fully connected and evenly distributed (and thus all have an average (X,Y) value near the center of the retina). This plot presents the center-of-gravity values computed in the X and Y directions, shown below:
topo.sim.V1.views.maps.XCoG.N + topo.sim.V1.views.maps.YCoG.N
The V1 X CoG plot shows the X location preferred by each neuron, and the V1 Y CoG plot shows the preferred Y locations. The monochrome values are scaled so that the neuron with the smallest X preference is colored black, and that with the largest is colored white, regardless of the absolute preference values (due to normalization being enabled with the .N property). Thus the absolute values of the X or Y preferences are not visible in these plots. (Without normalization, values below 0.0 are cropped to black, so only normalized plots are useful for this particular example.)
blue_channel = topo.sim.V1[:] # Convenient way to generate a zero channel (blue)
(topo.sim.V1.views.maps.XCoG.N * topo.sim.V1.views.maps.YCoG.N * blue_channel).rgb
The colorful plot above shows a false-color visualization of the CoG values, where the amount of red in the plot is proportional to the X CoG, and the amount of green in the plot is proportional to the Y CoG. Where both X and Y are low, the plot is black or very dark, and where both are high the plot is yellow (because red and green light together appears yellow). This provides a way to visualize how smoothly the combined (X,Y) position is mapped, although at this stage of training it is not particularly useful.
This two plots below shows the response for each Sheet in the model, which is zero at the start of the simulation (and thus both plots are black).
topo.sim.Retina[:] + topo.sim.V1[:]
We can have a look at what the training patterns that will be used to train the model without having to run it. In the cell labelled In [4] (in the model definition), we see where the training patterns are defined in a variable called input_pattern. We see that the circular Gaussian stimulus has x and y values that are drawn from a random distribution by two independentnumbergen.UniformRandomobjects. We can now view what 100 frames of training patterns will look like:
imagen.Animation(pattern=input_pattern, frames=100, offset=0.05)
At the end of the notebook, we will generate a set of nice animations showing the plots we have already shown evolve over development. We now create a Collector object that collects all information needed for plotting and animation. We will collect the information we have just examined and advance the simulation one iteration:
from topo.command.analysis import Collector
from topo.command.analysis import measure_or_pref, measure_cog
c = Collector()
with c.run(1):
c.collect(topo.sim.V1)
c.collect(measure_or_pref)
c.collect(measure_cog)
c.collect(topo.sim.V1.Afferent, rows=10,cols=10)
c.collect(topo.sim.V1.Afferent, activity=True)
After running the model a single iteration, the sheet activities now look as follows:
topo.sim.Retina[:] + topo.sim.V1[:]
In the Retina plot, each photoreceptor is represented as a pixel whose shade of grey codes the response level, increasing from black to white. As expected, this particular example matches the first frame we visualized in the training stimulus animation and the location of the Gaussian is near the border of the retina. The V1 plot shows the response to that input, which for a SOM is initially a large Gaussian-shaped blob centered around the maximally responding unit.
To see what the responses were before SOM’s neighborhood function forced them into a Gaussian shape, you can look at the Projection Activity plot, which shows the feedforward activity in V1:
topo.sim.V1.Afferent.projection_view().N
Here these responses are best thought of as Euclidean proximity, not distance. This formulation of the SOM response function actually subtracts the distances from the max distance, to ensure that the response will be larger for smaller Euclidean distances (as one intuitively expects for a neural response). The V1 feedforward activity appears random because the Euclidean distance from the input vector to the initial random weight vector is random.
If you look at the weights now we have run a single training iteration, you'll see that most of the neurons have learned new weight patterns based on this input:
topo.sim.V1.Afferent.grid()
Some of the weights to each neuron have now changed due to learning. In the SOM algorithm, the unit with the maximum response (i.e., the minimum Euclidean distance between its weight vector and the input pattern) is chosen, and the weights of units within a circular area defined by a Gaussian-shaped neighborhood function around this neuron are updated.
This effect is visible in this plot – a few neurons around the winning unit at the top middle have changed their weights. Let us run a few more iterations before having another look:
topo.sim.run(4)
topo.sim.V1.Afferent.grid()
The weights have been updated again - it is clear after these five iterations that the input patterns are becoming represented in the weight patterns, though not very cleanly yet. We also see that the projection activity patterns are becoming smoother, since the weight vectors are now similar between neighboring neurons.
topo.sim.V1.Afferent.projection_view().N
We will now advance to simulation time 100 because we will want to make animations regularly sampled every 100 steps:
topo.sim.run(95)
Let us use our Collector object c to collect more measurements. We will start collecting measurements every 50 steps till we complete 5000 iterations:
with c.run(100, cycles=49):
c.collect(topo.sim.V1)
c.collect(topo.sim.Retina)
c.collect(measure_cog)
c.collect(topo.sim.V1.Afferent, rows=10,cols=10)
c.collect(topo.sim.V1.Afferent, activity=True)
We see that the topographic grid plot now looks as follows:
c.V1.Afferent.CoG.top
The X and Y CoG plots are now smooth, but not yet the axis-aligned gradients (e.g. left to right and bottom to top) that an optimal topographic mapping would have:
c.V1.Afferent.XCoG.top.N + c.V1.Afferent.YCoG.top.N
The false-color visualization has also smoothed outL
(topo.sim.V1.views.maps.XCoG.top.N * topo.sim.V1.views.maps.YCoG.top.N * blue_channel).rgb
The weight patterns are still quite broad and not very selective for typical input patterns:
c.V1.Afferent.CFGrid.top
Additional training up to 10000 iterations (which becomes faster due to a smaller neighborhood radius) leads to a flat, square map:
with c.run(100, cycles=50):
c.collect(topo.sim.V1)
c.collect(topo.sim.Retina)
c.collect(measure_cog)
c.collect(topo.sim.V1.Afferent, rows=10,cols=10)
c.collect(topo.sim.V1.Afferent, activity=True)
c.V1.Afferent.CoG.top
The weight patterns are still quite broad and not very selective for typical input patterns:
c.V1.Afferent.CFGrid.top
By 30000 iterations the map has good coverage of the available portion of the input space:
with c.run(100, cycles=200):
c.collect(topo.sim.V1)
c.collect(topo.sim.Retina)
c.collect(measure_cog)
c.collect(topo.sim.V1.Afferent, rows=10,cols=10)
c.collect(topo.sim.V1.Afferent, activity=True)
c.V1.Afferent.CoG.top
The final projection plot at 30000 now shows that each neuron has developed weights concentrated in a small part of the input space, matching a prototypical input at one location:
c.V1.Afferent.CFGrid.top
For this particular example, the topographic mapping is flipped about the diagonal relative to the retina. Nothing in the network drives it to have any particular overall orientation or mapping apart from aligning to the square shape, and the map may turn out to be flipped or rotated by 90 degrees along any axis with equivalent results.
We can watch how the topographic mapping unfolds over all 30000 iterations we have run:
c.V1.Afferent.CoG
And how the XCoG and YCoG components unfold:
(c.V1.Afferent.XCoG.normalize_elements() + c.V1.Afferent.YCoG.normalize_elements())
As well as the false colour CoG plot:
(c.V1.Afferent.XCoG.normalize_elements() * c.V1.Afferent.YCoG.normalize_elements() * blue_channel).rgb
The weights (generating this animation may take a couple of minutes):
c.V1.Afferent.CFGrid
Snapshots of the retinal and V1 activity over development. Notice how the activity in V1 becomes more focused over time as the kernel radius decreases:
c.Retina.Activity + c.V1.Activity
Now, you can re-run the basic simulation by restarting the IPython Notebook Kernel (Kernel -> Restart) or using the keyboard Shortcut Ctrl+M . (see Help -> Keyboard Shortcuts for the full list). Then you can change one of the parameter values, either by editing the model definition about before running those cells, or by setting the defined parameters in the cell labelled In [2].
For instance, the starting value of the neighborhood radius (from which all future values are calculated according to exponential decay) is 1.0. You can change this value as you see fit, e.g. to 0.1, by setting p.radius_0=0.1. With such a small learning radius, global ordering is unlikely to happen, and one can expect the topographic grid not to flatten out (despite local order in patches).
Similarly, consider changing the initial learning rate from 0.42 to e.g. 1.0 (e.g. by setting p.alpha_0=1.0 in cell 2). The retina and V1 densities cannot be changed after the simulation has started, but again, they can be changed by providing their values on the command line as above (or by editing the som_retinotopy.ty file) and starting Topographica again.
You can also try changing the input_seed (p.input_seed=XX), to get a different stream of inputs, or weight_seed (p.weight_seed=XX), to get a different set of initial weights. With some of these values, you may encounter cases where the SOM fails to converge even though it seems to be working properly otherwise. For instance, some seed values result in topological defects like a ‘kink’: